Google Page Entity Extraction Template
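Workflow Overview
This is a medium-sized workflow with 6 nodes that extracts named entities from a web page using Google's Natural Language API.
Workflow Source Code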
{
"id": "4wPgPbxtojrUO7Dx",
"meta": {
"instanceId": "f46651348590f9c7e3e7fe91218ed49590c553ab737d5cc247951397ff85fa93"
},
"name": "Google Page Entity Extraction Template",
"tags": [
{
"id": "hBkrfz3jN0GbUgJa",
"name": "Google Page Entity Extraction Template",
"createdAt": "2025-05-08T23:29:39.011Z",
"updatedAt": "2025-05-08T23:29:39.011Z"
}
],
"nodes": [
{
"id": "8719f1de-2a3e-4c34-9edc-e4b8f993b525",
"name": "Respond to Webhook",
"type": "n8n-nodes-base.respondToWebhook",
"position": [
1240,
-420
],
"parameters": {
"options": {}
},
"typeVersion": 1.1
},
{
"id": "01420fd5-3483-4e74-b9fc-971199898449",
"name": "Google Entities",
"type": "n8n-nodes-base.httpRequest",
"position": [
1020,
-420
],
"parameters": {
"url": "https://language.googleapis.com/v1/documents:analyzeEntities",
"method": "POST",
"options": {},
"jsonBody": "={{ $json.apiRequest }}",
"sendBody": true,
"sendQuery": true,
"sendHeaders": true,
"specifyBody": "json",
"queryParameters": {
"parameters": [
{
"name": "key",
"value": "YOUR-GOOGLE-API-KEY"
}
]
},
"headerParameters": {
"parameters": [
{
"name": "Content-Type",
"value": "application/json"
}
]
}
},
"typeVersion": 4.2
},
{
"id": "5c1c258a-44ed-4d5a-a22d-cddb4df09018",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-300,
-700
],
"parameters": {
"color": 4,
"width": 620,
"height": 880,
"content": "# Google Page Entity Extraction Template\n## What this workflow does\nThis workflow allows you to extract named entities (people, organizations, locations, etc.) from any web page using Google's Natural Language API. Simply send a URL to the webhook endpoint, and the workflow will fetch the page content, process it through Google's entity recognition service, and return the structured entity data.\n### How to use\n1. Replace \"YOUR-GOOGLE-API-KEY\" with your actual Google Cloud API key (Natural Language API must be enabled)\n2. Activate the workflow and use the webhook URL as your endpoint\n3. Send a POST request to the webhook with a JSON body containing the URL you want to analyze: {\"url\": \"https://example.com/page\"}\n4. Review the returned entity analysis with categories, salience scores, and metadata\n## Webhook Input Format\nThe webhook expects a POST request with a JSON body in this format:\n```json\n{\n\"url\": \"https://website-to-analyze.com/page\"\n}\n```\n### Response Format\nThe webhook returns a JSON response containing the full entity analysis from Google's Natural Language API, including:\n- Entity names and types (PERSON, LOCATION, ORGANIZATION, etc.)\n- Salience scores indicating entity importance\n- Metadata and mentions within the text\n- Entity sentiment (if available)"
},
"typeVersion": 1
},
{
"id": "79add9a7-adca-4ce5-8a6a-5fcb75288846",
"name": "Get Url",
"type": "n8n-nodes-base.webhook",
"position": [
360,
-420
],
"webhookId": "2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
"parameters": {
"path": "2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
"options": {},
"httpMethod": "POST",
"responseMode": "responseNode"
},
"typeVersion": 2
},
{
"id": "081a52bc-2da7-44fb-bdc3-4cb73cbf8dd3",
"name": "Get URL Page Contents",
"type": "n8n-nodes-base.httpRequest",
"position": [
580,
-420
],
"parameters": {
"url": "={{ $json.body.url }}",
"options": {}
},
"typeVersion": 4.2
},
{
"id": "dda5ef3d-f031-4dd6-b117-c1f69aa66b63",
"name": "Respond with detected entities",
"type": "n8n-nodes-base.code",
"position": [
800,
-420
],
"parameters": {
"jsCode": "// Clean and prepare HTML for API request\nconst html = $input.item.json.data;\n// Trim if too large (optional)\nconst trimmedHtml = html.length > 100000 ? html.substring(0, 100000) : html;\nreturn {\n  json: {\n    apiRequest: {\n      document: {\n        type: \"HTML\",\n        content: trimmedHtml\n      },\n      encodingType: \"UTF8\"\n    }\n  }\n}"
},
"typeVersion": 2
}
],
"active": false,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "432203af-190a-4a89-81d8-f86682a0b63f",
"connections": {
"Get Url": {
"main": [
[
{
"node": "Get URL Page Contents",
"type": "main",
"index": 0
}
]
]
},
"Google Entities": {
"main": [
[
{
"node": "Respond to Webhook",
"type": "main",
"index": 0
}
]
]
},
"Get URL Page Contents": {
"main": [
[
{
"node": "Respond with detected entities",
"type": "main",
"index": 0
}
]
]
},
"Respond with detected entities": {
"main": [
[
{
"node": "Google Entities",
"type": "main",
"index": 0
}
]
]
}
}
}
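Features
- Receives a page URL through a POST webhook
- Fetches the page's HTML with an HTTP Request node
- Truncates the HTML to 100,000 characters before analysis
- Sends the HTML to the analyzeEntities endpoint of Google's Natural Language API
- Returns the entity analysis in the webhook response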
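To illustrate how a caller might use the returned analysis, here is a minimal sketch that ranks entities by salience. It assumes the response follows the analyzeEntities shape (an `entities` array whose items carry `name`, `type`, and `salience`); the function name `topEntities` and the sample data are hypothetical and not part of the workflow.

```javascript
// Hypothetical helper: return the n most salient entities from an
// analyzeEntities-style response ({ entities: [{ name, type, salience }] }).
function topEntities(response, n = 5) {
  const entities = Array.isArray(response.entities) ? response.entities : [];
  return [...entities] // copy so the caller's array is not reordered
    .sort((a, b) => (b.salience ?? 0) - (a.salience ?? 0))
    .slice(0, n)
    .map(({ name, type, salience }) => ({ name, type, salience }));
}

// Made-up sample data, for illustration only.
const sampleResponse = {
  entities: [
    { name: "Paris", type: "LOCATION", salience: 0.2 },
    { name: "Ada Lovelace", type: "PERSON", salience: 0.7 },
    { name: "Acme", type: "ORGANIZATION", salience: 0.1 },
  ],
};

console.log(topEntities(sampleResponse, 2));
```

Sorting a copy keeps the original response untouched, which matters if the same object is logged or stored afterwards.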
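Technical Analysis
Node Types and Roles
- Respond to Webhook: returns the API result to the caller
- HTTP Request (Get URL Page Contents, Google Entities): fetches the page and calls the Natural Language API
- Sticky Note: holds the usage documentation
- Webhook (Get Url): receives the POST request containing the URL
- Code (Respond with detected entities): builds the analyzeEntities request body; despite its name, it does not send the response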
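The request-building step performed by the Code node can be sketched as a standalone function, which makes it easy to try outside n8n. The name `buildApiRequest` and the `maxChars` parameter are illustrative; the 100,000-character limit, the HTML document type, and the UTF8 encoding come from the workflow source.

```javascript
// Illustrative standalone version of the Code node's logic.
// maxChars is a hypothetical parameter; the workflow hardcodes 100000.
function buildApiRequest(html, maxChars = 100000) {
  // Truncate oversized pages so the request body stays bounded.
  const trimmedHtml = html.length > maxChars ? html.substring(0, maxChars) : html;
  return {
    document: { type: "HTML", content: trimmedHtml },
    encodingType: "UTF8",
  };
}

const exampleRequest = buildApiRequest("<p>Hello from Paris</p>");
console.log(JSON.stringify(exampleRequest, null, 2));
```

In the workflow this object is wrapped as `{ json: { apiRequest: ... } }` so that the Google Entities node can read it from `$json.apiRequest`.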
Complexity Assessment
Configuration difficulty:
Maintenance difficulty:
Extensibility:
Implementation Guide
Prerequisites
- Access to an n8n instance
- A Google Cloud API key with the Natural Language API enabled
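Configuration Steps
- Import the workflow JSON file into n8n
- In the Google Entities node, replace YOUR-GOOGLE-API-KEY with your Google Cloud API key
- Activate the workflow and copy the webhook URL from the Get Url node
- Send a test POST request with a JSON body such as {"url": "https://example.com/page"}
- Check that the response contains the entity analysis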
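Once the workflow is imported and active, a test request to the webhook could be built as shown below. The host in `webhookUrl` is a placeholder for your own n8n instance and the `/webhook/` prefix is an assumption about how that instance exposes production webhooks; only the path ID comes from the workflow source. `buildWebhookRequest` is a hypothetical helper, and the `fetch` call is commented out so the snippet runs without network access.

```javascript
// Hypothetical helper: build fetch() options for the workflow's webhook.
function buildWebhookRequest(pageUrl) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // The webhook expects a JSON body of the form {"url": "..."}.
    body: JSON.stringify({ url: pageUrl }),
  };
}

// Placeholder host; replace with the webhook URL shown in your n8n instance.
const webhookUrl =
  "https://n8n.example.com/webhook/2944c8f6-03cd-4ab8-8b8e-cb033edf877a";
const requestOptions = buildWebhookRequest("https://example.com/page");
console.log(webhookUrl, requestOptions);

// Uncomment to send the request:
// fetch(webhookUrl, requestOptions)
//   .then((res) => res.json())
//   .then((analysis) => console.log(analysis));
```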
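Key Parameters
| Parameter | Default | Description |
|---|---|---|
| key | YOUR-GOOGLE-API-KEY | Google Cloud API key sent as a query parameter by the Google Entities node; must be replaced |
| document.type | HTML | Document type passed to analyzeEntities |
| encodingType | UTF8 | Encoding the API uses to calculate text offsets in its response |
| content length limit | 100000 | Maximum number of characters of page HTML sent to the API |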
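Best Practices
Optimization Suggestions
- Adjust the 100,000-character limit in the Code node to match the size of the pages you analyze
- Filter the returned entities by salience score to keep only the most important ones
- Test with representative pages before relying on the results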
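Security Considerations
- Store the API key and other credentials securely rather than sharing them inside the exported workflow
- Restrict access to the workflow and to the webhook URL
- Review execution logs regularly
- Enable two-factor authentication on the Google account that owns the API key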
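Because the workflow fetches whatever URL the caller supplies, one possible safeguard is to validate the URL before the fetch step. The sketch below is an assumption about how such a check might look and is not part of the workflow. `isAllowedUrl` is a hypothetical name, and the function only checks the scheme, so it does not by itself block requests to internal hosts.

```javascript
// Hypothetical check: accept only absolute http(s) URLs.
function isAllowedUrl(input) {
  try {
    const parsed = new URL(input); // throws on unparseable input
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

console.log(isAllowedUrl("https://example.com/page"));
console.log(isAllowedUrl("file:///etc/passwd"));
```

A check like this could run in a Code node placed between the webhook and the page fetch.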
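Performance Optimization
- Avoid repeated work by skipping URLs that have already been analyzed
- Cache results for frequently requested pages
- Process multiple pages in parallel, within the API's quota
- Monitor system resource usage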
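Troubleshooting
Common Issues
No entities or unexpected entities returned
Check that the Get URL Page Contents node returned HTML in its data field and that the relevant text falls within the first 100,000 characters.
Google API request fails
Confirm that YOUR-GOOGLE-API-KEY has been replaced with a valid key and that the Natural Language API is enabled for the project.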
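Debugging Tips
- Inspect each node's input and output in the execution log
- Use a simple test page to verify the output
- Check network connectivity and the API service status
- Run the workflow step by step to locate the failing node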
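Error Handling
The exported workflow configures no explicit error handling; every node's options object is empty. Mechanisms worth adding:
- Automatic retry on network timeouts (for example, up to 3 attempts)
- Logging and alerting on API errors
- A fallback response when the page fetch or the API call fails
- Validation of the incoming URL before fetching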
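A retry of up to three attempts, as mentioned above, could be sketched as follows for code that calls the workflow or the API from JavaScript. `withRetry`, its parameters, and the `flaky` demo function are hypothetical; inside n8n itself, the per-node Retry On Fail setting is the more natural place for this behavior.

```javascript
// Hypothetical helper: call fn up to `attempts` times, waiting delayMs
// between failed attempts, and rethrow the last error if all attempts fail.
async function withRetry(fn, attempts = 3, delayMs = 500) {
  let lastError;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Demo: a made-up operation that fails twice, then succeeds.
let demoCalls = 0;
const flaky = async () => {
  demoCalls += 1;
  if (demoCalls < 3) throw new Error("simulated timeout");
  return "ok";
};

withRetry(flaky, 3, 10).then((result) => console.log(result, demoCalls));
```

A fixed delay keeps the sketch short; a growing delay between attempts is a common refinement when the remote service is rate limited.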